Abstract

Using the electric and coupling approaches, we derive a series of results concerning the 
mixing times for the stratified random walk on the d-cube, inspired by the results of Chung and
11, 199-222.
1 Introduction



The stratified random walk (SRW) on the d-cube $Q_d$ is the Markov chain whose state space is
the set of vertices of the d-cube and whose transition probabilities are defined thus:

Given a set of non-zero probabilities $p = (p_0, p_1, \ldots, p_{d-1})$, from any vertex with $k$ 1's,
the process moves either to any neighboring vertex with $k+1$ 1's with probability $p_k/d$, or to
any neighboring vertex with $k-1$ 1's with probability $p_{k-1}/d$, or to itself with the remaining
probability. The simple random walk on the d-cube corresponds to the choice $p_k = 1$ for all $k$.

Roughly speaking, the mixing time of a Markov chain is the time it takes the chain to have
its distribution close to the stationary distribution under some measure of closeness. Chung
and Graham studied the SRW on the d-cube in [?], mainly with algebraic methods, and found
bounds for the mixing times under total variation and relative pointwise distances. Here we
use non-algebraic methods, the electric and coupling approaches, in order to study the same
SRW and get exact results for maximal commute times and bounds for cover times and mixing
times under total variation distance. We take advantage of the fact that hitting times, commute
times, cover times and the various definitions of mixing time, under any measure of closeness,
tend to be linked to one another by some inequality (see Aldous and Fill [?] and Lovász
and Winkler [?]).

2 The electric approach 

On a connected undirected graph $G = (V, E)$ such that the edge between vertices $i$ and $j$ is
given a resistance $r_{ij}$ (or equivalently, a conductance $C_{ij} = 1/r_{ij}$), we can define the random
walk on $G$ as the Markov chain $X = \{X(n)\}_{n \ge 0}$ that from its current vertex $v$ jumps to the
neighboring vertex $w$ with probability $p_{vw} = C_{vw}/C(v)$, where $C(v) = \sum_{w : w \sim v} C_{vw}$, and $w \sim v$
means that $w$ is a neighbor of $v$. There may be a conductance $C_{zz}$ from a vertex $z$ to itself,
giving rise to a transition probability from $z$ to itself. Some notation: $E_a T_b$ and $E_a C$ denote
the expected value, starting from the vertex $a$, of, respectively, the hitting time $T_b$ of the vertex
$b$ and the cover time $C$, i.e., the number of jumps needed to visit all the states in $V$; $R_{ab}$ is the
effective resistance, as computed by means of Ohm's law, between vertices $a$ and $b$.

A Markov chain is reversible if $\pi_i P(i,j) = \pi_j P(j,i)$ for all $i, j$, where $\{\pi_\cdot\}$ is the stationary
distribution and $P(\cdot,\cdot)$ are the transition probabilities. Such a reversible Markov chain can be
described as a random walk on a graph if we define conductances thus:

$$C_{ij} = \pi_i P(i,j). \tag{2.1}$$

We will be interested in finding a closed form expression for the commute time $E_0 T_d + E_d T_0$
between the origin, denoted by 0, and its opposite vertex, denoted by d.

Notice first that the transition matrix for $X = \{X(n), n \ge 0\}$, the SRW on the d-cube,
is doubly stochastic and therefore its stationary distribution is uniform. If we now collapse
all vertices in the cube with the same number of 1's into a single vertex, and we look at the
SRW on this collapsed graph, we obtain a new reversible Markov chain $S = \{S(n), n \ge 0\}$, a
birth-and-death chain in fact, on the state space $\{0, 1, \ldots, d\}$, with transition probabilities

$$P(k,k+1) = \frac{d-k}{d}\, p_k, \tag{2.2}$$

$$P(k,k-1) = \frac{k}{d}\, p_{k-1}, \tag{2.3}$$

$$P(k,k) = 1 - P(k,k+1) - P(k,k-1). \tag{2.4}$$

It is plain to see that the stationary distribution of this new chain is the Binomial with
parameters $d$ and $\frac{1}{2}$. It is also clear that the commute time between vertices 0 and d is the
same for both X and S. For the latter we use the electric machinery described above, namely,
we think of a linear electric circuit from 0 to d with conductances given by (2.1) for $0 \le i \le d$,
$j = i-1, i, i+1$, and where $\pi_i = \binom{d}{i} / 2^d$.
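The collapsed chain can be checked numerically. The following is a minimal sketch, assuming the transition probabilities (2.2)-(2.4) and exact rational arithmetic; the function names and the illustrative weights `p` are hypothetical choices made here, not part of the original text. It builds the transition matrix of S and verifies that the Binomial(d, 1/2) distribution satisfies the reversibility condition.

```python
from fractions import Fraction
from math import comb

def collapsed_chain(p):
    """Transition matrix of the collapsed chain S, following (2.2)-(2.4).

    p is the list (p_0, ..., p_{d-1}); the cube dimension d is len(p).
    """
    d = len(p)
    P = [[Fraction(0)] * (d + 1) for _ in range(d + 1)]
    for k in range(d + 1):
        if k < d:
            P[k][k + 1] = Fraction(d - k, d) * p[k]   # (2.2)
        if k > 0:
            P[k][k - 1] = Fraction(k, d) * p[k - 1]   # (2.3)
        P[k][k] = 1 - sum(P[k])                        # (2.4): remaining probability
    return P

def binomial_half(d):
    """Binomial(d, 1/2) probabilities as exact fractions."""
    return [Fraction(comb(d, k), 2 ** d) for k in range(d + 1)]

def is_reversible(P, pi):
    """Check pi_i P(i,j) == pi_j P(j,i) for every pair of states."""
    n = len(P)
    return all(pi[i] * P[i][j] == pi[j] * P[j][i]
               for i in range(n) for j in range(n))

# Arbitrary illustrative weights for d = 4 (an assumption, not from the text).
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1), Fraction(1, 4)]
P = collapsed_chain(p)
pi = binomial_half(len(p))
print(is_reversible(P, pi))
```

Reversibility with respect to a probability vector implies stationarity, so the same check also confirms that the binomial distribution is stationary for S.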

It is well known (at least since Chandra et al. proved it in [?]) that

$$E_a T_b + E_b T_a = R_{ab} \sum_{z} C(z), \tag{2.5}$$

where $R_{ab}$ is the effective resistance between vertices $a$ and $b$.
If this formula is applied to a reversible chain whose conductances are given as in (2.1), then
it is clear that

$$C(z) = \pi_z$$

and therefore the summation in (2.5) equals 1. We get then this compact formula for the
commute time:

$$E_a T_b + E_b T_a = R_{ab}, \tag{2.6}$$

where the effective resistance is computed with the individual resistors having resistances

$$r_{ij} = \frac{1}{C_{ij}} = \frac{1}{\pi_i P(i,j)}.$$

In our particular case of the collapsed chain, because it is a linear circuit, the effective
resistance $R_{0d}$ equals the sum of all the individual resistances $r_{i,i+1}$, so that (2.6) yields

$$E_0 T_d + E_d T_0 = R_{0d} = 2^d \sum_{k=0}^{d-1} \frac{1}{p_k \binom{d-1}{k}}. \tag{2.7}$$

Because of the particular nature of the chain under consideration, it is clear that $E_0 T_d + E_d T_0$
equals the maximal commute time ($\tau^*$ in the terminology of Aldous [?]) between any two vertices.
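As a sanity check on the commute time formula, the sum of resistances can be compared with a direct computation of $E_0 T_d + E_d T_0$. The sketch below assumes the transition probabilities (2.2)-(2.3) and uses the standard one-step recursions for hitting times of neighboring states in a birth-and-death chain; the function names are hypothetical, and this is an illustration rather than the derivation used in the text.

```python
from fractions import Fraction
from math import comb

def commute_time_formula(p):
    """Electric expression: 2^d * sum_k 1 / (p_k * C(d-1, k))."""
    d = len(p)
    return 2 ** d * sum(1 / (Fraction(p[k]) * comb(d - 1, k)) for k in range(d))

def commute_time_direct(p):
    """E_0 T_d + E_d T_0 obtained by adding neighbor-to-neighbor hitting times."""
    d = len(p)
    up = [Fraction(d - k, d) * Fraction(p[k]) for k in range(d)]            # P(k, k+1), k = 0..d-1
    down = [Fraction(k, d) * Fraction(p[k - 1]) for k in range(1, d + 1)]   # P(k, k-1), k = 1..d
    total = Fraction(0)
    prev = Fraction(0)
    for k in range(d):
        # E_k T_{k+1} = (1 + P(k,k-1) * E_{k-1} T_k) / P(k,k+1)
        back = down[k - 1] if k > 0 else Fraction(0)
        prev = (1 + back * prev) / up[k]
        total += prev
    prev = Fraction(0)
    for k in range(d, 0, -1):
        # E_k T_{k-1} = (1 + P(k,k+1) * E_{k+1} T_k) / P(k,k-1)
        fwd = up[k] if k < d else Fraction(0)
        prev = (1 + fwd * prev) / down[k - 1]
        total += prev
    return total

# Simple random walk on the square (d = 2): both computations give 8.
print(commute_time_formula([1, 1]), commute_time_direct([1, 1]))
```

For d = 2 and $p_k = 1$ the walk needs on average 4 steps to go from 0 to d and 4 to come back, which matches $2^2 (1 + 1) = 8$.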

(i) For simple random walk, formula (2.7) is simplified by taking all $p_k = 1$. This particular
formula was obtained in [?] with a more direct argument, and it was argued there that

$$\sum_{k=0}^{d-1} \frac{1}{\binom{d-1}{k}} = 2 + o(1).$$

An application of Matthews' result (see [?]), linking maximum and minimum expected hitting
times with expected cover times, yields immediately that the expected cover time is $E_v C =
\Theta(|V| \log |V|)$, which is the asymptotic value of the lower bound for cover times of walks on a
graph $G = (V, E)$ (see [?]). Thus we could say this SRW is a "rapidly covered" walk.

(ii) The so-called Aldous cube (see [?]) corresponds to the choice $p_k = \frac{k}{d-1}$. This walk takes
place in the "punctured cube" that excludes the origin. Formula (2.7) thus must exclude $k = 0$
in this case, for which we still get a closed-form expression for the commute time between vertex
d, all of whose coordinates are 1, and vertex s which consists of the collapse of all vertices with
a single 1:

$$E_s T_d + E_d T_s = 2^d \sum_{k=1}^{d-1} \frac{1}{\binom{d-2}{k-1}}. \tag{2.8}$$

The same argument used in (i) tells us that the summation in (2.8) equals $2 + o(1)$ and, once
again, Matthews' result tells us that the walk on the Aldous cube has a cover time of order
$|V| \log |V|$.

(iii) The choice $p_k = 1/\binom{d-1}{k}$ would be, in the terminology of [?], a "slow walk": the commute
time is seen to be exactly equal to $|V| \log_2 |V|$ and thus the expected cover time is $O(|V| \log^2 |V|)$.

In general, the SRW will be rapidly covered if and only if

$$\sum_{k=0}^{d-1} \frac{1}{p_k \binom{d-1}{k}} \le c$$

for some constant c.
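The three cases can be compared numerically through the sum that appears in the commute time formula. This is a minimal sketch, assuming the choices $p_k = 1$ (simple walk), $p_k = k/(d-1)$ (Aldous cube, with the term $k = 0$ excluded) and $p_k = 1/\binom{d-1}{k}$ (slow walk); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def resistance_sum(p, start=0):
    """sum_{k=start}^{d-1} 1 / (p_k * C(d-1, k)), with d = len(p)."""
    d = len(p)
    return sum(1 / (Fraction(p[k]) * comb(d - 1, k)) for k in range(start, d))

def simple_walk(d):
    return [Fraction(1)] * d

def aldous_cube(d):
    # p_0 = 0 here, so the corresponding sum must start at k = 1.
    return [Fraction(k, d - 1) for k in range(d)]

def slow_walk(d):
    return [Fraction(1, comb(d - 1, k)) for k in range(d)]

for d in (10, 50, 200):
    print(d,
          float(resistance_sum(simple_walk(d))),
          float(resistance_sum(aldous_cube(d), start=1)),
          resistance_sum(slow_walk(d)))
```

The first two columns approach 2 as d grows, in line with the "rapidly covered" criterion, while the slow walk gives exactly d, so its commute time is $2^d d = |V| \log_2 |V|$.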



Remark. A formula as compact as (2.7) could be easily obtained through the commute
time formula (2.6). It does not seem that it could be obtained that easily by just adding the
individual hitting times $E_i T_{i+1}$ (a procedure that is done, for instance, in [?], [10], [11], and
in the next section).



3 The coupling approach 



In order to assess the rate of convergence of the SRW on the cube $Q_d$ to the uniform stationary
distribution $\pi$, we will bound the mixing time $\tau$ defined as

$$\tau = \min\{t : d(t') \le \tfrac{1}{2e} \text{ for all } t' \ge t\},$$

where

$$d(t) = \max_{x \in Q_d} \|P_x(X(t) = \cdot) - \pi(\cdot)\|$$

and $\|\theta_1 - \theta_2\|$ is the variation distance between probability distributions $\theta_1$ and $\theta_2$, one of whose
alternative definitions is (see Aldous and Fill [?], chapter 2):

$$\|\theta_1 - \theta_2\| = \min P(V_1 \ne V_2),$$

where the minimum is taken over random pairs $(V_1, V_2)$ such that $V_m$ has distribution $\theta_m$, $m =
1, 2$.

The bound for the mixing time is achieved using a coupling argument that goes as follows:
let $\{X(t), t \ge 0\}$ and $\{Y(t), t \ge 0\}$ be two versions of the SRW on $Q_d$ such that $X(0) = x$ and
$Y(0) \sim \pi$. Then

$$\|P_x(X(t) = \cdot) - \pi(\cdot)\| \le P(X(t) \ne Y(t)). \tag{3.1}$$

A coupling between the processes X and Y is a bivariate process such that its marginals
have the distributions of the original processes and such that once the bivariate process enters
the diagonal, it stays there forever. If we denote by

$$T_x = \inf\{t : X(t) = Y(t)\}$$

the coupling time, i.e., the hitting time of the diagonal, then (3.1) translates as

$$\|P_x(X(t) = \cdot) - \pi(\cdot)\| \le P(T_x > t), \tag{3.2}$$

and therefore,

$$d(t) \le \max_{x \in Q_d} P(T_x > t). \tag{3.3}$$

If we can find a coupling such that $E T_x = O(f(d))$, for all $x \in Q_d$ and for a certain function $f$
of the dimension d, then we will also have $\tau = O(f(d))$. Indeed, if we take $t = 2e f(d)$, then (3.3)
and Markov's inequality imply that $d(t) \le 1/(2e)$, and the definition of $\tau$ implies $\tau = O(f(d))$.
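The last step can be written out as a chain of inequalities. This is a sketch under the assumption that $E T_x \le f(d)$ for every $x$, with any constant hidden in the O-notation absorbed into $f$:

```latex
\[
  d(t') \;\le\; \max_{x \in Q_d} P(T_x > t')
        \;\le\; \max_{x \in Q_d} \frac{E\,T_x}{t'}
        \;\le\; \frac{f(d)}{2e\,f(d)} \;=\; \frac{1}{2e}
  \qquad \text{for all } t' \ge t = 2e\,f(d),
\]
\[
  \text{hence}\quad \tau \;\le\; 2e\,f(d).
\]
```

The first inequality is the coupling bound on $d(\cdot)$, the second is Markov's inequality applied to the coupling time, and the third uses $t' \ge 2e f(d)$.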

We will split $T_x$ as $T_x = T_x^1 + T_x^2$, where $T_x^1$ is a coupling time for the birth-and-death process
S, and $T_x^2$ is another coupling time for the whole process X, once the bivariate S process enters
the diagonal, and we will bound the values of $E T_x^1$ and $E T_x^2$.

More formally, for any $x, y \in Q_d$ define

$$s(x) = \sum_{i=1}^{d} x_i, \tag{3.4}$$

$$d(x,y) = \sum_{i=1}^{d} |x_i - y_i|. \tag{3.5}$$

Define also for the birth-and-death process $S(t) = s(X(t))$ its own mixing time:

$$\tau^{(S)} = \min\{t : d_S(t) \le \tfrac{1}{2e}\},$$

where $d_S(t) = \max_i \|P_i(S(t) = \cdot) - \pi_S(\cdot)\|$, and $\pi_S$ is the stationary distribution of S. Notice
that $s(Y(0)) \sim \pi_S$ since $Y(0) \sim \pi$.

Now we will prove that $\tau^{(S)} = O(f_S(d))$, for a certain function $f_S$ of the expected hitting
times of the "central states", and that this bound implies an analogous bound for $E T_x^1$. Indeed,
as shown by Aldous [?], we can bound $\tau^{(S)}$ by a more convenient stopping time

$$\tau^{(S)} \le K_2\, \tau_1^{(3)}, \tag{3.6}$$

where $\tau_1^{(3)} = \min_\mu \max_i \min_{U_i} E_i U_i$ and the innermost minimum is taken over stopping times
$U_i$ such that $P_i(S(U_i) \in \cdot) = \mu(\cdot)$. In particular,

\begin{align}
\tau_1^{(3)} &\le \min_b \max_i \min_{U_i^b} E_i U_i^b \tag{3.7} \\
             &\le \min_b \max_i E_i T_b \tag{3.8} \\
             &= \max(E_0 T_{d/2}, E_d T_{d/2}), \tag{3.9}
\end{align}

where the innermost minimum in (3.7) is taken over stopping times $U_i^b$ such that $P_i(S(U_i^b) =
b) = 1$. Expression (3.9) follows from (3.8) since we are dealing with birth and death chains.
Therefore, combining (3.6) and (3.9) we have

$$\tau^{(S)} \le K_2 \max(E_0 T_{d/2}, E_d T_{d/2}) := f_S(d). \tag{3.10}$$



In general, a coupling implies an inequality like (3.2). However, the inequality becomes an
equality for a certain maximal (non-Markovian) coupling, described by Griffeath [?]. Let $T_x^1$ be
the coupling time for the maximal coupling between $s(X(t))$ and $s(Y(t))$ such that

$$\|P_x(s(X(t)) = \cdot) - \pi_S(\cdot)\| = P(T_x^1 > t).$$

Then

$$d_S(t) = \max_{x \in Q_d} \|P_x(s(X(t)) = \cdot) - \pi_S(\cdot)\| = \max_{x \in Q_d} P(T_x^1 > t).$$

By the definition of $\tau^{(S)}$ it is clear that $P(T_x^1 > \tau^{(S)}) \le \frac{1}{2e}$. Moreover, by the
"submultiplicativity" property (see Aldous and Fill [?], chapter 2)

$$d(s + t) \le 2 d(s) d(t), \quad s, t \ge 0, \tag{3.11}$$

we have that

$$P(T_x^1 > k\tau^{(S)}) \le \frac{1}{2e^k}, \quad k \ge 1. \tag{3.12}$$

Thus

\begin{align*}
E T_x^1 &= \sum_{k=1}^{\infty} E\big(T_x^1\, 1((k-1)\tau^{(S)} < T_x^1 \le k\tau^{(S)})\big) \\
        &\le \tau^{(S)} \sum_{k=1}^{\infty} k\, P((k-1)\tau^{(S)} < T_x^1 \le k\tau^{(S)}) \\
        &\le \tau^{(S)} \sum_{k=1}^{\infty} k\, P((k-1)\tau^{(S)} < T_x^1) \\
        &\le \tau^{(S)} \Big(1 + \sum_{k=2}^{\infty} \frac{k}{2e^{k-1}}\Big).
\end{align*}

Since the series on the right-hand side converges and $\tau^{(S)} \le f_S(d)$ by (3.10), we have

$$E T_x^1 = O(f_S(d)).$$
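The constant multiplying the mixing time of S in this bound can be evaluated explicitly. The sketch below assumes the tail bound $P(T_x^1 > k\tau^{(S)}) \le 1/(2e^k)$ for $k \ge 1$ and sums the resulting series $1 + \sum_{k \ge 2} k/(2e^{k-1})$, comparing a truncated sum with the closed form given by the geometric-series identity $\sum_{k \ge 1} k x^{k-1} = (1-x)^{-2}$ at $x = 1/e$. The function name is hypothetical.

```python
import math

def series_constant(terms=200):
    """1 + sum_{k>=2} k / (2 e^(k-1)), truncated after `terms` summands."""
    return 1 + sum(k / (2 * math.exp(k - 1)) for k in range(2, terms + 2))

# Closed form: 1 + (1/2) * ((1 - 1/e)^(-2) - 1)
closed_form = 1 + 0.5 * (1 / (1 - math.exp(-1)) ** 2 - 1)

print(series_constant(), closed_form)
```

Both values agree and are below 2, so under these assumptions the expected coupling time for S is at most twice its mixing time.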
Once the bivariate S process hits the diagonal

$$D = \Big\{(x, y) \in Q_d \times Q_d : \sum_{i=1}^{d} x_i = \sum_{i=1}^{d} y_i\Big\}, \tag{3.13}$$

we devise one obvious coupling that forces the bivariate X process to stay in D and such that the
distance defined in (3.5) between the marginal processes does not increase. In words: we select
one coordinate at random; if the marginal processes coincide in that coordinate, we allow them
to evolve together; otherwise we select another coordinate in order to force two new coincidences.
Formally, for each $(X(t), Y(t)) \in D$, let $I_1$, $I_2$ and $I_3$ be the partition of $\{1, \ldots, d\}$ such that

$$I_1 = \{i : X_i(t) = Y_i(t)\},$$

$$I_2 = \{i : X_i(t) = 0,\ Y_i(t) = 1\},$$

$$I_3 = \{i : X_i(t) = 1,\ Y_i(t) = 0\}.$$



Given $(X(t), Y(t)) \in D$, choose $i$ u.a.r. from $\{1, \ldots, d\}$.

(a) If $i \in I_1$:

1. If $X_i(t) = 1$ then make $X_i(t+1) = Y_i(t+1) = 0$ with probability $p_{s(X(t))-1}$; otherwise
$X_i(t+1) = Y_i(t+1) = 1$.

2. If $X_i(t) = 0$ then make $X_i(t+1) = Y_i(t+1) = 1$ with probability $p_{s(X(t))}$; otherwise
$X_i(t+1) = Y_i(t+1) = 0$.

(b) If $i \in I_2$:

1. Select $j \in I_3$ u.a.r.;

2. Make $X_i(t+1) = Y_j(t+1) = 1$ with probability $p_{s(X(t))}$; otherwise $X_i(t+1) =
Y_j(t+1) = 0$.

(c) If $i \in I_3$:

1. Select $j \in I_2$ u.a.r.;

2. Make $X_i(t+1) = Y_j(t+1) = 0$ with probability $p_{s(X(t))-1}$; otherwise $X_i(t+1) =
Y_j(t+1) = 1$.

Then, it is easy to check that $(X(t+1), Y(t+1)) \in D$ and $d(X(t+1), Y(t+1)) \le
d(X(t), Y(t))$. Moreover, noticing that $|I_2| = |I_3| = d(X(t), Y(t))/2$, we have

$$P\big(d(X(t+1), Y(t+1)) = d(X(t), Y(t)) - 2 \mid X(t), Y(t)\big) = \frac{d(X(t), Y(t))}{2d} \big(p_{s(X(t))} + p_{s(X(t))-1}\big), \tag{3.14}$$

$$P\big(d(X(t+1), Y(t+1)) = d(X(t), Y(t)) \mid X(t), Y(t)\big) = 1 - \frac{d(X(t), Y(t))}{2d} \big(p_{s(X(t))} + p_{s(X(t))-1}\big). \tag{3.15}$$
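One step of this second coupling can be sketched in code. The following is an illustration under stated assumptions, not the authors' implementation: coordinates are indexed from 0, `p` is a list with entries $p_0, \ldots, p_{d-1}$, and the names `coupled_step` and `hamming` are hypothetical. The pair must start in D (equal numbers of 1's).

```python
import random

def hamming(x, y):
    """Distance d(x, y): number of coordinates where x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def coupled_step(x, y, p, rng):
    """One step of the coupling for a pair (x, y) with sum(x) == sum(y)."""
    d = len(x)
    s = sum(x)
    assert s == sum(y), "the pair must lie in the diagonal D"
    i = rng.randrange(d)                 # coordinate chosen u.a.r.
    x2, y2 = list(x), list(y)
    if x[i] == y[i]:                     # case (a): move together on coordinate i
        if x[i] == 1:
            new = 0 if rng.random() < p[s - 1] else 1
        else:
            new = 1 if rng.random() < p[s] else 0
        x2[i] = y2[i] = new
    elif x[i] == 0:                      # case (b): i in I_2, pick j in I_3
        j = rng.choice([k for k in range(d) if x[k] == 1 and y[k] == 0])
        new = 1 if rng.random() < p[s] else 0
        x2[i] = new
        y2[j] = new
    else:                                # case (c): i in I_3, pick j in I_2
        j = rng.choice([k for k in range(d) if x[k] == 0 and y[k] == 1])
        new = 0 if rng.random() < p[s - 1] else 1
        x2[i] = new
        y2[j] = new
    return x2, y2

rng = random.Random(0)
x, y = [1, 1, 0, 0, 1, 0], [0, 0, 1, 1, 1, 0]
p = [0.5] * 6                            # illustrative weights, an assumption
for _ in range(50):
    x, y = coupled_step(x, y, p, rng)
print(sum(x) == sum(y), hamming(x, y))
```

In cases (b) and (c) a successful move creates two new coincidences at once, which is why the distance can only stay the same or drop by 2.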

In this case, it is straightforward to compute

$$m(i, s(X(t))) = i - E\big[d(X(t+1), Y(t+1)) \mid d(X(t), Y(t)) = i,\ X(t), Y(t)\big]
= \frac{i}{d} \big(p_{s(X(t))} + p_{s(X(t))-1}\big). \tag{3.16}$$

Let $T_x^2$ be the coupling time for the second coupling just described. That is, let $T_x^2 =
\inf\{t \ge 0 : d(X(t), Y(t)) = 0\}$, with time counted from $T_x^1$. Then, as a consequence of the
optional sampling theorem for martingales we have the following comparison lemma (cf. Aldous
and Fill [?], Chapter 2).
Lemma 3.17

$$E\big[T_x^2 \mid d(X(T_x^1), Y(T_x^1)) = L,\ (X(T_x^1), Y(T_x^1)) \in D,\ s(X(T_x^1)) = s\big] \le \sum_{i=1}^{L} \frac{1}{m(i,s)} \tag{3.18}$$

for all $s = 0, 1, \ldots, d$.

Proof. Define $(X'(t), Y'(t)) = (X(t + T_x^1), Y(t + T_x^1))$ for all $t \ge 0$. Define $Z(t) = d(X'(t), Y'(t))$
and $\mathcal{F}_t = \sigma(X'(t), Y'(t))$. Then, it follows from (3.16) that

$$m(i, s) = i - E[Z(1) \mid Z(0) = i,\ s(X'(0)) = s]. \tag{3.19}$$

Also, for all $s \in \{1, \ldots, d-1\}$, $0 < m(1, s) \le m(2, s) \le \ldots \le m(d, s)$. Fix $s \in \{1, \ldots, d-1\}$
and write

$$h(i) = \sum_{j=1}^{i} \frac{1}{m(j,s)} \tag{3.20}$$

and extend $h$ by linear interpolation for all real $0 \le x \le d$. Then $h$ is concave and for all $i \ge 1$

\begin{align*}
E[h(Z(1)) \mid Z(0) = i,\ s(X'(0)) = s] &\le h(i - m(i,s)) \\
                                        &\le h(i) - m(i,s)\, h'(i) \\
                                        &= h(i) - 1,
\end{align*}

where the first inequality follows from the concavity of $h$ and $h'$ is the first derivative of $h$. Now,
defining $\tilde{h} \ge 0$ such that

$$h(i) = 1 + \sum_{j} P[Z(1) = j \mid Z(0) = i,\ s(X'(0)) = s]\, h(j) + \tilde{h}(i) \tag{3.21}$$

and

$$M(t) = t + h(Z(t)) + \sum_{u=0}^{t-1} \tilde{h}(Z(u)) \tag{3.22}$$

we have that $M$ is an $\mathcal{F}_t$-martingale and applying the optional sampling theorem to the stopping
time $T_0 = \inf\{t : Z(t) = 0\}$ we have

$$E[M(T_0) \mid Z(0) = i,\ s(X'(0)) = s] = E[M(0) \mid Z(0) = i,\ s(X'(0)) = s] = h(i). \tag{3.23}$$

Noticing that $M(T_0) \ge T_0$ and $T_0 = T_x^2$, we obtain the desired result. $\blacksquare$
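Since $s(X(t))$ is distributed as $\pi_S$, we can write:

$$E\big[T_x^2 \mid d(X(T_x^1), Y(T_x^1)) = L,\ (X(T_x^1), Y(T_x^1)) \in D\big] \le \sum_{s=0}^{d} \pi_S(s) \sum_{i=1}^{L} \frac{1}{m(i,s)} \le \sum_{s=0}^{d} \pi_S(s) \sum_{i=1}^{d} \frac{1}{m(i,s)} =: g(d). \tag{3.24}$$

Putting the pieces together, we have found a coupling time $T_x$ for the whole process such
that

$$E T_x \le f_S(d) + g(d).$$

The task now is to find explicit bounds for $f_S(d)$ and $g(d)$ for particular workable cases.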

To avoid unnecessary complications, we will assume $d = 2m$, and compute only the hitting
times for the S process of the type $E_0 T_m$. Hitting times in birth-and-death processes assume
the following closed form (see [?] for an electrical derivation):

$$E_k T_{k+1} = \frac{1}{\pi_k P(k,k+1)} \sum_{i=0}^{k} \pi_i, \quad 0 \le k \le d-1,$$

and in our case this expression turns into

$$E_k T_{k+1} = \frac{\sum_{i=0}^{k} \binom{d}{i}}{\binom{d-1}{k}\, p_k}.$$

Therefore

$$E_0 T_m = \sum_{k=0}^{m-1} \frac{\sum_{i=0}^{k} \binom{2m}{i}}{\binom{2m-1}{k}\, p_k}. \tag{3.25}$$

(i) In case all $p_k = 1$, we have the simple random walk on the cube, and it turns out there
is an even more compact expression of (3.25), namely:

$$E_0 T_m = \sum_{k=0}^{m-1} \frac{\sum_{i=0}^{k} \binom{2m}{i}}{\binom{2m-1}{k}} = m \sum_{k=0}^{m-1} \frac{1}{2k+1}, \tag{3.26}$$

as was proved in [?], and the right-hand side of (3.26) equals $m[H(2m) - \frac{1}{2} H(m)]$, where
$H(n) = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$, allowing us to conclude immediately (using $H(n) \approx \log n$) that in
this case $E_0 T_m = E_0 T_{d/2} \approx \frac{d}{4} \log d + \frac{d}{4} \log 2$.
Also, we have

$$E\big[T_x^2 \mid d(X(T_x^1), Y(T_x^1)) = L,\ (X(T_x^1), Y(T_x^1)) \in D\big] \le \sum_{i=1}^{L} \frac{d}{2i} \approx \frac{d}{2}\, O(\log L). \tag{3.27}$$

Thus in this case both $f_S(d)$ and $g(d)$, and a fortiori $E T_x$ and $\tau$, are $O(d \log d)$.
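The compact expression for the simple random walk can be checked in exact arithmetic. The sketch below assumes the double-sum form of $E_0 T_m$ for $d = 2m$ given above and compares it with $m \sum_{k=0}^{m-1} 1/(2k+1)$ and with $m[H(2m) - \frac{1}{2}H(m)]$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def hit_center_double_sum(m, p=None):
    """E_0 T_m for d = 2m as a double sum; p defaults to the simple walk p_k = 1."""
    d = 2 * m
    if p is None:
        p = [1] * d
    total = Fraction(0)
    for k in range(m):
        partial = sum(comb(d, i) for i in range(k + 1))      # sum_{i<=k} C(d, i)
        total += Fraction(partial, comb(d - 1, k)) / Fraction(p[k])
    return total

def hit_center_compact(m):
    """m * sum_{k=0}^{m-1} 1 / (2k + 1)."""
    return m * sum(Fraction(1, 2 * k + 1) for k in range(m))

def harmonic(n):
    """H(n) = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

for m in (1, 2, 3, 10):
    print(m, hit_center_double_sum(m), hit_center_compact(m))
```

For m = 2 (the 4-cube) both expressions give 8/3: one step from level 0 to level 1, then 5/3 expected steps from level 1 to level 2.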

(ii) For the Aldous cube, $p_k = \frac{k}{d-1}$, and (3.25) becomes (recall this cube excludes the origin):

$$E_1 T_m = \sum_{k=1}^{m-1} \frac{\sum_{i=0}^{k} \binom{2m}{i}}{\binom{2m-2}{k-1}} = \sum_{k=0}^{m-2} \frac{\sum_{i=0}^{k} \binom{2m}{i}}{\binom{2m-2}{k}} + \sum_{k=0}^{m-2} \frac{\binom{2m}{k+1}}{\binom{2m-2}{k}}. \tag{3.28}$$

After some algebra, it can be shown that the second summand in (3.28) equals
$(2m-1)\big[H(2m-1) - \frac{1}{m}\big]$, and the first summand can be bounded by twice the expression in (3.26),
on account of the fact that $\binom{2m-1}{k} \le 2 \binom{2m-2}{k}$, for $0 \le k \le m-1$. Therefore, we can
write

$$E_1 T_{d/2} \le \frac{3}{2}\, d \log d + \text{smaller terms},$$

thus improving by a constant factor the computation of the same hitting time in [?].
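The two algebraic facts used for the Aldous cube can be verified exactly for small m. This sketch assumes that the second summand is $\sum_{k=0}^{m-2} \binom{2m}{k+1} / \binom{2m-2}{k}$ with claimed closed form $(2m-1)[H(2m-1) - 1/m]$, and that the binomial comparison is $\binom{2m-1}{k} \le 2\binom{2m-2}{k}$ for $0 \le k \le m-1$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def second_summand(m):
    """sum_{k=0}^{m-2} C(2m, k+1) / C(2m-2, k), computed term by term."""
    return sum(Fraction(comb(2 * m, k + 1), comb(2 * m - 2, k)) for k in range(m - 1))

def second_summand_closed(m):
    """(2m - 1) * (H(2m - 1) - 1/m)."""
    h = sum(Fraction(1, j) for j in range(1, 2 * m))   # H(2m - 1)
    return (2 * m - 1) * (h - Fraction(1, m))

def binomial_ratio_ok(m):
    """Check C(2m-1, k) <= 2 * C(2m-2, k) for 0 <= k <= m-1."""
    return all(comb(2 * m - 1, k) <= 2 * comb(2 * m - 2, k) for k in range(m))

for m in (2, 3, 10):
    print(m, second_summand(m), second_summand_closed(m), binomial_ratio_ok(m))
```

The closed form follows from writing each term as $(2m-1)\big[\frac{1}{k+1} + \frac{1}{2m-k-1}\big]$, and the binomial comparison from $\binom{2m-1}{k} / \binom{2m-2}{k} = \frac{2m-1}{2m-1-k}$.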
Also, we have

$$E\big[T_x^2 \mid d(X(T_x^1), Y(T_x^1)) = L,\ (X(T_x^1), Y(T_x^1)) \in D,\ s(X(T_x^1)) = s\big] \le \sum_{i=1}^{L} \frac{d(d-1)}{i(2s-1)}. \tag{3.29}$$

Thus, in this case

\begin{align}
E\big[T_x^2 \mid d(X(T_x^1), Y(T_x^1)) = d,\ (X(T_x^1), Y(T_x^1)) \in D\big]
 &\le \Phi(-\sqrt{d}/3) \sum_{i=1}^{d} \frac{d(d-1)}{i} + \big(1 - \Phi(-\sqrt{d}/3)\big) \sum_{i=1}^{d} \frac{d(d-1)}{i(2d/3 - 1)} \notag \\
 &\approx e^{-d/9}\, d(d-1) \log d + (1 - e^{-d/9})\, d \log d. \tag{3.30}
\end{align}

And so $\tau = O(d \log d)$ also in this case.
(iii) Slower walks. Consider the case when the probability $p_k$ grows as a power of $k$, more
specifically

$$p_k = \Big(\frac{k+1}{d+1}\Big)^{a} \tag{3.31}$$

with $a > 1$. In this case, it seems that (3.25) is useless to get a closed expression for $E_0 T_{d/2}$.
However, Graham and Chung [?] provide the following bound

$$E_i T_{d/2} \le c_0(a)\, d^a, \quad \text{for all } d \ge d_0(a),\ 0 \le i \le d, \tag{3.32}$$

where $c_0(a)$ and $d_0(a)$ are constants depending only on $a$. Moreover, (3.24) becomes



9(d) 



d(d+ l) a 
^ i((s + l) a + s a ) 



d(d+l) a J2 



7T S {S) 



d 



, y s + l) a + s a *r-! i 

s=0 x ' i=l 



< 



d(d + 1) Q log d^ 

s=0 
d 

d(d+l) a logdY^ 



ir s (s) 



(s + 1)" + s c 



s=0 



(s + iy 



d(d +l) a log dE 



(3.33) 



.(i + xy 

where $X$ is a Binomial$(d, \frac{1}{2})$ random variable. Jensen's inequality and the same argument that
led to (3.30) show that $E[(1 + X)^{-a}] \approx O(d^{-a})$ and (3.33) can be bounded by $O(d \log d)$. This
fact together with (3.32) allows us to conclude that $\tau = O(d^a)$ in this case, thus improving on
the rate of the mixing time provided in [?] by a factor of $\log d$.
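The order of the inverse moment can be illustrated numerically. This is a minimal sketch, assuming $X \sim$ Binomial$(d, \frac{1}{2})$ and computing $d^a E[(1+X)^{-a}]$ by direct summation; the function name and the sample values of `d` and `a` are hypothetical choices for illustration.

```python
from math import comb

def scaled_inverse_moment(d, a):
    """d^a * E[(1 + X)^(-a)] for X ~ Binomial(d, 1/2), by direct summation."""
    expectation = sum(comb(d, s) / 2 ** d * (1 + s) ** (-a) for s in range(d + 1))
    return d ** a * expectation

for d in (50, 100, 200, 400):
    print(d, scaled_inverse_moment(d, 2))
```

If the scaled quantity stays bounded as d grows, then $E[(1+X)^{-a}]$ is of order $d^{-a}$, which is what turns the factor $d(d+1)^a \log d$ into a bound of order $d \log d$.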